Cascade Adversarial Machine Learning Regularized with a Unified Embedding

Authors

  • Taesik Na
  • Jong Hwan Ko
  • Saibal Mukhopadhyay
Abstract

Deep neural network classifiers are vulnerable to small input perturbations carefully crafted by an adversary. Injecting adversarial inputs during training, known as adversarial training, can improve robustness against one-step attacks, but not against unknown iterative attacks. To address this challenge, we propose to utilize the embedding space for both classification and low-level (pixel-level) similarity learning, so that unknown pixel-level perturbations can be ignored. During training, we inject adversarial images without replacing their corresponding clean images and penalize the distance between the two embeddings (clean and adversarial). This additional regularization encourages two similar images (the clean and perturbed versions) to produce the same outputs, though not necessarily the true labels, enhancing the classifier's robustness against pixel-level perturbations. Next, we show that iteratively generated adversarial images transfer easily between networks trained with the same strategy. Inspired by this observation, we also propose cascade adversarial training, which transfers the knowledge of the end results of adversarial training. We train a network from scratch by injecting iteratively generated adversarial images crafted from already defended networks, in addition to one-step adversarial images from the network being trained. Experimental results show that cascade adversarial training together with our proposed low-level similarity learning efficiently enhances robustness against iterative attacks, though at the expense of decreased robustness against one-step attacks. We show that combining these two techniques can also improve robustness under the worst-case black-box attack scenario.
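The regularization described in the abstract lends itself to a short sketch: adversarial images are injected alongside (not instead of) their clean counterparts, and the distance between the two embeddings is penalized on top of the usual classification loss. The following is a minimal PyTorch sketch under stated assumptions, not the authors' exact setup: the toy network, the one-step FGSM attack, the L2 (MSE) choice for the embedding distance, and hyper-parameters such as `eps` and `lam` are illustrative.

```python
# Minimal sketch: adversarial training regularized with an embedding-distance
# penalty between clean and adversarial images. Assumptions: toy network,
# FGSM one-step attack, MSE as the embedding distance, illustrative eps/lam.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SmallConvNet(nn.Module):
    """Toy classifier whose penultimate activations serve as the embedding."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):
        emb = self.features(x)              # shared ("unified") embedding
        return emb, self.classifier(emb)


def fgsm(model, x, y, eps):
    """One-step adversarial example via the fast gradient sign method."""
    x_adv = x.clone().detach().requires_grad_(True)
    _, logits = model(x_adv)
    grad, = torch.autograd.grad(F.cross_entropy(logits, y), x_adv)
    return (x_adv + eps * grad.sign()).clamp(0, 1).detach()


def training_step(model, optimizer, x, y, eps=8 / 255, lam=0.5):
    """Classify clean + adversarial images; penalize embedding distance."""
    x_adv = fgsm(model, x, y, eps)
    emb_clean, logits_clean = model(x)
    emb_adv, logits_adv = model(x_adv)
    cls_loss = F.cross_entropy(logits_clean, y) + F.cross_entropy(logits_adv, y)
    reg_loss = F.mse_loss(emb_clean, emb_adv)   # low-level similarity term
    loss = cls_loss + lam * reg_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    model = SmallConvNet()
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    x, y = torch.rand(4, 3, 32, 32), torch.randint(0, 10, (4,))
    print(training_step(model, opt, x, y))
```

For cascade adversarial training, the same step would additionally draw iteratively generated adversarial images crafted against an already-defended network and mix them into the batch; that source network is omitted here for brevity.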

Similar articles

Cascade Adversarial Machine Learning Regularized with a Unified Embedding

Injecting adversarial examples during training, known as adversarial training, can improve robustness against one-step attacks, but not against unknown iterative attacks. To address this challenge, we first show that iteratively generated adversarial images transfer easily between networks trained with the same strategy. Inspired by this observation, we propose cascade adversarial training, which transf...


EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples

Recent studies have highlighted the vulnerability of deep neural networks (DNNs) to adversarial examples: a visually indistinguishable adversarial image can easily be crafted to cause a well-trained model to misclassify. Existing methods for crafting adversarial examples are based on L2 and L∞ distortion metrics. However, despite the fact that L1 distortion accounts for the total variation and e...
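As a point of reference, an elastic-net attack of this kind augments the usual L2 penalty with an L1 distortion term. The display below is a sketch of such an objective in generic notation; the symbols c, β, f, t, and x_0 are illustrative assumptions, not taken from the truncated summary above.

\[
\min_{\mathbf{x} \in [0,1]^p} \; c \cdot f(\mathbf{x}, t) \;+\; \beta \,\lVert \mathbf{x} - \mathbf{x}_0 \rVert_1 \;+\; \lVert \mathbf{x} - \mathbf{x}_0 \rVert_2^2
\]

Here x_0 is the original image, f is an attack loss that rewards misclassification toward a target class t, and β balances L1 against L2 distortion.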


Cycles in Adversarial Regularized Learning

Regularized learning is a fundamental technique in online optimization, machine learning, and many other fields of computer science. A natural question that arises in these settings is how regularized learning algorithms behave when pitted against each other. We study a natural formulation of this problem by coupling regularized learning dynamics in zero-sum games. We show that the system's behav...


CNN Based Adversarial Embedding with Minimum Alteration for Image Steganography

Historically, steganographic schemes were designed in a way to preserve image statistics or steganalytic features. Since most of the state-of-the-art steganalytic methods employ a machine learning (ML) based classifier, it is reasonable to consider countering steganalysis by trying to fool the ML classifiers. However, simply applying perturbations on stego images as adversarial examples may lea...


Manifold Regularized Deep Neural Networks using Adversarial Examples

Learning meaningful representations using deep neural networks involves designing efficient training schemes and well-structured networks. Currently, the method of stochastic gradient descent that has a momentum with dropout is one of the most popular training protocols. Based on that, more advanced methods (i.e., Maxout and Batch Normalization) have been proposed in recent years, but most stil...



Journal:
  • CoRR

Volume: abs/1708.02582

Publication date: 2017